(57) Abstract

A communication system (2000) is disclosed. One embodiment includes at least two functional blocks, wherein a first functional 
block (1002) communicates with a second functional block (1008) by establishing a connection, wherein the connection is a logical state 
in which data may pass between the first functional block (1002) and the second functional block (1008). Another embodiment includes a 
bus coupled to each of the functional blocks and configured to carry a plurality of signals. The plurality of signals includes a connection 
identifier that indicates a particular connection that a data transfer is part of, and a thread identifier that indicates a transaction stream that 
the data transfer is part of.

COMMUNICATIONS SYSTEM AND METHOD WITH MULTILEVEL
CONNECTION IDENTIFICATION 

FIELD OF THE INVENTION 

The present invention relates to a communication system to couple 
computing sub-systems.

BACKGROUND OF THE INVENTION

Electronic computing and communications systems continue to include 
greater numbers of features and to increase in complexity. At the same time, 
electronic computing and communications systems decrease in physical size and 
cost per function. Rapid advances in semiconductor technology such as four-layer 
deep-sub-micron complementary metal-oxide semiconductor (CMOS) technology,
have enabled true "system-on-a-chip" designs. These complex designs may 
incorporate, for example, one or more processor cores, a digital signal processing 
(DSP) core, several communications interfaces, and graphics support in application- 
specific logic. In some systems, one or several of these extremely complex chips 
must communicate with each other and with other system components. Significant 
new challenges arise in the integration, verification and testing of such systems 
because efficient communication must take place between sub-systems on a single
complex chip as well as between chips on a system board. One benefit to having an
efficient and flexible method for communication between sub-systems and chips is
that system components can be reused in other systems with a minimum of
redesign.

WO 00/29961    PCT/US99/26901

One challenge in the integration, verification and testing of modern
electronic systems stems from the fact that modern electronic systems in many
application areas have functionality, cost and form-factor requirements that
mandate the sharing of resources, such as memory, among multiple functional
blocks, where functional blocks can be any entity that interfaces to a communication
system. In such systems, the functional blocks typically possess different
performance characteristics and requirements, and the communications system and 
shared resources must simultaneously satisfy the total requirements. Key 
requirements of typical functional blocks are bandwidth and latency constraints 
that can vary over several orders of magnitude between functional blocks. In order 
to simultaneously satisfy constraints that vary so widely, communications systems 
must provide high degrees of predictability. 

Traditional approaches to the design of communications systems for 
modern, complex computer systems have various strengths and weaknesses. An
essential aspect of such approaches is the communications interface that various
sub-systems present to one another. One approach is to define customized point-
to-point interfaces between a sub-system and each peer with which it must
communicate. This customized approach offers protocol simplicity, guaranteed 
performance, and isolation from dependencies on unrelated sub-systems. 
Customized interfaces, however, are by their nature inflexible. The addition of a 
new sub-system with a different interface requires design rework. 

A second approach is to define a system using standardized interfaces. 
Many standardized interfaces are based on pre-established computer bus protocols.
The use of computer buses allows flexibility in system design, since as many
different functional blocks may be connected together as required by the system, as
long as the bus has sufficient performance. It is also necessary to allocate access to
the bus among various sub-systems. In the case of computer buses, resource 
allocation is typically referred to as arbitration. 

One disadvantage of computer buses is that each sub-system or component 
connected to the bus is constrained to use the protocol of the bus. In some cases, this 
limits the performance of the sub-system. For example, a sub-system may be 
capable of handling multiple transaction streams simultaneously, but the bus 
protocol is not capable of fully supporting concurrent operations. In the case of a 
sub-system handling multiple transaction streams where each transaction stream
has ordering constraints, it is necessary for the sub-system to identify each
increment of data received or transmitted with a certain part of a certain data
stream to distinguish between streams and to preserve order within a stream. This
includes identifying a sub-system that is a source of a data transmission.
Conventionally, such identification is limited to a non-configurable hardware
identifier that is generated by a particular sub-system or component.

Current bus systems provide limited capability to preserve order in one
transaction stream by supporting "split transactions" in which data from one
transaction may be interleaved with data from another transaction in the same
stream. In such a bus, data is tagged as belonging to one stream of data, so that it
can be identified even if it arrives out of order. This requires the receiving sub-
system to decode an arriving address to extract the identification information.

Current bus systems do not support true concurrency of operations for a
sub-system that can process multiple streams of transactions over a single
interconnect, such as a memory controller that handles access to a single dynamic
random access memory (DRAM) for several clients of the DRAM. A DRAM
controller may require information related to a source of an access request, a
priority of an access request, ordering requirements, etc. Current communication
systems do not provide for such information to be transmitted with data without
placing an additional burden on the sub-system to adapt to the existing protocol.

In order for many sub-systems to operate in conventional systems using all
of their capabilities, additional knowledge must be designed into the sub-systems to
provide communication over existing communication systems. This makes sub-
systems more expensive and less flexible in the event the sub-system is later
required to communicate with new sub-systems or components. Existing
communication approaches thus do not meet the requirements of today's large,
complex electronics systems. Therefore, it is desirable for a communications system
and mechanism to allow sub-systems of a large, complex electronics system to
inter-operate efficiently regardless of their varying performance characteristics and
requirements.

SUMMARY OF THE INVENTION 

One embodiment of the present invention includes a shared communications
bus for providing flexible communication capability between electronic sub-
systems. One embodiment includes a protocol that allows for identification of data
transmissions at different levels of detail as required by a particular sub-system
without additional knowledge being designed into the sub-system.

One embodiment of the invention includes several functional blocks, 
including at least one initiator functional block and one target functional block. 
Some initiator functional blocks may also function as target functional blocks. In 
one embodiment, the initiator functional block is coupled to an initiator interface 
module and the target functional block is coupled to a target interface module. The 
initiator functional block and the target functional block communicate to their 
respective interface modules and the interface modules communicate with each
other. The initiator functional block communicates with the target functional block 
by establishing a connection, wherein a connection is a logical state in which data 
may pass between the initiator functional block and the target functional block. 

One embodiment also includes a bus configured to carry multiple signals,
wherein the signals include a connection identifier signal that indicates a particular
connection that a data transfer between an initiator functional block and a target
functional block is part of. The connection identifier includes information about the
connection, such as which functional block is the source of a transmission, a priority 
of a transfer request, and transfer ordering information. One embodiment also 
includes a thread identifier, which provides a subset of the information provided by 
the connection identifier. In one embodiment, the thread identifier is an identifier 
of local scope that identifies transfers between an interface module and a connected 
functional block, where in some embodiments, an interface module connects a 
functional block to a shared communications bus. 

The connection identifier is an identifier of global scope that transfers
information between interface modules or between functional blocks through their
interface modules. Some functional blocks may require all the information 
provided by the connection identifier, while other functional blocks may require 
only the subset of information provided by the thread identifier. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram of a complex electronics
system according to the present invention.

Figure 2 is an embodiment of a system module. 

Figure 3 is an embodiment of a system module. 

Figure 4 is an embodiment of a communications bus.

Figure 5 is a timing diagram showing pipelined write transfers. 

Figure 6 is a timing diagram showing rejection of a first pipelined write
transfer and a successful second write transfer.

Figure 7 is a timing diagram showing interleaving of pipelined read and
write transfers.

Figure 8 is a timing diagram showing interleaved connections to a single
target.

Figure 9 is a timing diagram showing interleaved connections from a single
initiator.

Figure 10 is a block diagram of one embodiment of part of a computer
system.

Figure 11 is one embodiment of a communications bus.

Figure 12 is a block diagram of one embodiment of part of a computer
system.

DETAILED DESCRIPTION

The present invention is a communications system and method for allowing
multiple functional blocks or sub-systems of a complex electronics system to
communicate with each other through a shared communications resource, such as a
shared communications bus. In one embodiment, a communications protocol
allows multiple functional blocks on a single semiconductor device to communicate
with each other. In another embodiment, the communications protocol may be used
to allow multiple functional blocks on different semiconductor devices to
communicate with each other through a shared off-chip communications resource,
such as a bus.

In one embodiment, the present invention is a pipelined communications bus 
with separate command, address, and data wires. Alternative embodiments 
include a pipelined communications bus with multiplexed address, data, and 
control signals. The former embodiment offers higher performance and simpler 
control than the latter embodiment at the expense of extra wires. The former 
embodiment may be more appropriate for on-chip communications, where wires 
are relatively less expensive and performance requirements are usually higher. The 
latter embodiment offers higher per-wire transfer efficiency, because it shares the 
same wires among address and data transfers. The latter embodiment may be more 
appropriate for chip-to-chip communications between semiconductor devices, 
because package pins and board traces increase the per signal cost, while total 
required communications performance is usually lower. 

Figure 1 is a block diagram of a complex electronics system 100. Shared
communications bus 112 connects sub-systems 102, 104, 106, 108, and 110. Sub-
systems are typically functional blocks including an interface module for interfacing
to a shared bus. Sub-systems may themselves include one or more functional
blocks and may or may not include an integrated or physically separate interface
module. In one embodiment, the sub-systems connected by communications bus
112 are separate integrated circuit chips. Sub-system 104 is an application specific
integrated circuit (ASIC) which, as is known, is an integrated circuit designed to
perform a particular function. Sub-system 106 is a dynamic random access memory
(DRAM). Sub-system 108 is an erasable, programmable, read only memory
(EPROM). Sub-system 110 is a field programmable gate array (FPGA). Sub-system
102 is a fully custom integrated circuit designed specifically to operate in system
100. Other embodiments may contain additional sub-systems of the same types as
shown, or other types not shown. Other embodiments may also include fewer sub-
systems than the sub-systems shown in system 100. Integrated circuit 102 includes
sub-systems 102A, 102B, 102C, 102D and 102E. ASIC 104 includes functional blocks
104A, 104B and 104C. FPGA 110 includes functional blocks 110A and 110B. A
functional block may be a particular block of logic that performs a particular 
function. A functional block may also be a memory component on an integrated 
circuit. 

System 100 is an example of a system that may consist of one or more 
integrated circuits or chips. A functional block may be a logic block on an 
integrated circuit such as, for example, functional block 102E, or a functional block 
may also be an integrated circuit such as fully custom integrated circuit 102 that 
implements a single logic function. 

Shared communications bus 112 provides a shared communications bus
between sub-systems of system 100. Shared communications bus 114 provides a
shared communications bus between sub-systems or functional blocks on a single
integrated circuit. Some of the functional blocks shown are connected to interface
modules through which they send and receive signals to and from shared
communications bus 112 or shared communications bus 114. Interconnect 115 is a
local point-to-point interconnect for connecting interface modules to functional 
blocks. 

Interface modules 120-127 are connected to various functional blocks as 
shown. In this embodiment, interface modules 120, 122, 123 and 124 are physically 
separated from their connected functional block (A, B, C, E and 102, respectively). 
Interface modules 121, and 125-128 are essentially part of their respective functional 
blocks or sub-systems. Some functional blocks, such as 102D, do not require a 
dedicated interface module. The arrangement of subsystems, functional blocks 
and interface modules is flexible and is determined by the system designer.

In one embodiment there are four fundamental types of functional blocks.
The four fundamental types are initiator, target, bridge, and snooping blocks. A
typical target is a memory device; a typical initiator is a central processing unit
(CPU). Functional blocks all communicate with one another via shared
communications bus 112 or shared communications bus 114 and the protocol of one
embodiment. Initiator and target functional blocks may communicate with a shared
communications bus through interface modules. An initiator functional block may
communicate with a shared communications bus through an initiator interface
module and a target functional block may communicate with a shared
communications bus through a target interface module.

An initiator interface module issues and receives read and write requests to 
and from functional blocks other than the one with which it is associated. In one 
embodiment, an initiator interface module is typically connected to a CPU, a digital
signal processing (DSP) core, or a direct memory access (DMA) engine. 

Figure 2 is a block diagram of an embodiment of an initiator interface 
module 800. Initiator interface module 800 includes clock generator 802, data flow
block 806, arbitrator block 804, address/command decode block 808, configuration
registers 810, and synchronizer 812. Initiator interface module 800 is connected to a
shared communications bus 814 and to an initiator functional block 816. In one
embodiment, shared communications bus 814 is a shared communications bus that
connects sub-systems, as bus 112 does in Figure 1.

Clock generator 802 is used to perform clock division when initiator
functional block 816 runs synchronously with respect to shared communications
bus 814 but at a different frequency. When initiator functional block 816 runs
asynchronously with respect to communications bus 814, clock generator 802 is not
used, but synchronizer 812 is used. Arbitrator block 804 performs arbitration for
access to shared communications bus 814. In one embodiment, a multi-level
arbitration scheme is used wherein arbitrator block 804 includes logic circuits that
manage pre-allocated bandwidth aspects of first level arbitration and also logic that
manages second level arbitration. Data flow block 806 includes data flow first-in
first-out (FIFO) buffers between shared communications bus 814 and initiator
functional block 816, in addition to control logic associated with managing a
transaction between shared communications bus 814 and initiator functional block
816. The FIFO buffers stage both the address and data bits transferred between
shared communications bus 814 and initiator functional block 816. In one
embodiment, shared communications bus 814 implements a memory mapped
protocol. Specific details of an underlying computer bus protocol are not significant
to the invention, provided that the underlying computer bus protocol supports
some operation concurrency. A preferred embodiment of a bus protocol for use
with the present invention is one that supports retry transactions or split
transactions, because these protocols provide a mechanism to deliver operation
concurrency by interrupting a multi-cycle transaction to allow transfers belonging
to other unrelated transactions to take place. These protocols allow for higher
transfer efficiencies because independent transactions may use the bus while an
initiator waits for a long latency target to return data that has been previously
requested by an initiator.

Address/command decode block 808 decodes an address on shared
communications bus 814 to determine if a write is to be performed to registers
associated with initiator functional block 816. Address/command decode block 808
also decodes incoming commands. Configuration registers 810 store bits that
determine the state of module 800, including bandwidth allocation and client
address base. One register 810 stores an identification (ID) which is a set of bits
uniquely identifying initiator functional block 816.

Figure 3 is a block diagram of an embodiment of a target interface module
900. Target interface module 900 is connected to shared communications bus 914
and to target functional block 918. Target interface module 900 includes clock
generator 902, data flow block 906, address/command decode block 908,
synchronizer 912, and state registers in state control block 916. Blocks of target
interface module 900 that are named similarly to blocks of initiator module 800
function in substantially the same way as explained with respect to initiator block
800. State registers and state control block 916 include registers that store, for
example, client address base and an identifier for target functional block 918.

In one embodiment, an initiator functional block such as initiator functional
block 816 may also act as a target functional block in that it has the capability to
respond to signals from other functional blocks or sub-systems as well as to initiate
actions by sending signals to other functional blocks or sub-systems.

Figure 4 is a block diagram of a part of a computer system 1000 according to 
one embodiment. Figure 4 is useful in illustrating multilevel connection 
identification. System 1000 includes initiator functional block 1002, which is 
connected to initiator interface module 1004 by interconnect 1010. Initiator interface 
module 1004 is connected to target interface module 1006 by shared 
communications bus 1012. Target interface module 1006 is connected to target 
functional block 1008 by an interconnect 1010. Typically, shared communications
bus 1012 is analogous to shared communications bus 112 of Figure 1 or to shared 
communications bus 114 of Figure 1. Interconnects 1010 are typically analogous to 
interconnect 115 of Figure 1 in that they connect functional blocks to interface 
modules and are point-to-point, rather than shared, interconnects. Interconnects 
1010 are typically physically shorter than shared communications bus 1012 because 
of their local nature. As will be explained more fully below, system 1000 uses two 
different levels of connection identification depending upon the requirements of a 
particular functional block. "Global" connection identification information is sent 
on shared communications bus 1012, while "local" connection information, or 
thread identification information, is sent in interconnects 1010. 

Figure 5 is a block diagram of one embodiment of a shared communications 
bus 1012. Shared communications bus 1012 is shown connected to entities A, B, C, 
D and E, which may be interface modules or functional blocks. Shared 
communications bus 1012 is composed of a set of wires. Data wires 230 provide
direct, high efficiency transport of data traffic between functional blocks on shared
communications bus 1012. In one embodiment, shared communications bus 1012
supports a bus protocol that is a framed, time division multiplexed, fully pipelined,
fixed latency communication protocol using separate address, data, and connection
identification wires. The bus protocol supports fine grained interleaving of 
transfers to enable high operation concurrency, and uses retry transactions to 
efficiently implement read transactions from target devices with long or variable 
latency. Details of the arbitration method used to access shared communications 
bus 1012 are not required to understand the present invention. The delay from
when an initiator functional block drives the command and address until the target 
functional block drives the response is known as the latency of shared 
communications bus 1012. The bus protocol supports arbitration among many 
initiator functional blocks and target functional blocks for access to the bus. In the 
embodiment shown, arbitration for access to shared communications bus 1012 is 
performed by an initiator interface module, such as module 1004 of Figure 4. In 
other embodiments, arbitration is performed by functional blocks directly, or by a 
combination of interface modules and functional blocks. In one embodiment, a bus
grant lasts for one pipelined bus cycle. The protocol does not forbid a single 
functional block from becoming a bus owner for consecutive bus cycles, but does 
require that the functional block successfully win arbitration on consecutive cycles 
to earn the right. 

Shared communications bus 1012 includes separate address, data, and 
control wires. Other embodiments may include multiplexed address, data, and 
control signals that share a wire or wires. Such an embodiment would provide high 
per-wire transfer efficiency because wires are shared among address and data
transfers. A non-multiplexed embodiment of shared communications bus 1012 may 
be more appropriate for communication between functional blocks on a single 
integrated circuit chip because wires are relatively inexpensive and performance 
requirements are usually higher on a single integrated circuit chip.

Clock line 220 is a global signal wire that provides a time reference signal to
which all other shared communications bus 1012 signals are synchronized. Reset 
line 222 is a global signal wire that forces each connected functional block into a 
default state from which system configuration may begin. Command line 224 
carries a multi-bit signal driven by an initiator bus owner. In various embodiments, 
the multi-bit command signal may convey various types of information. For 
example, a command signal may indicate a transfer type, information regarding 
duration of a connection, and expected initiator and target behavior during the 
connection. In one embodiment, the command signal includes one or more bits 
indicating the beginning and end of a connection. In one embodiment, for example, 
one bit may indicate the status of a connection. If the bit is zero, the current transfer 
is the final transfer in the connection. After the receipt of a zero connection status 
bit, the next receipt of a connection status bit that is a logic one indicates that the 
transfer is the first in a newly opened connection. Each subsequently received one 
connection status bit then indicates that the connection is still open. 
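
By way of illustration only, the single-bit connection status encoding described above may be decoded as in the following Python sketch. The function name `classify_transfers` and the labels it returns are hypothetical and do not appear in the disclosure. The sketch further assumes that no connection is open when decoding begins, for example immediately after reset.

```python
def classify_transfers(status_bits):
    """Label each transfer from its connection status bit (hypothetical labels)."""
    labels = []
    connection_open = False  # assumption: nothing is open at the start
    for bit in status_bits:
        if bit == 0:
            # A zero bit marks the final transfer in the connection.
            labels.append("close")
            connection_open = False
        elif not connection_open:
            # First one bit after a zero bit: a newly opened connection.
            labels.append("open")
            connection_open = True
        else:
            # Further one bits: the connection is still open.
            labels.append("continue")
    return labels
```

Under these assumptions, a zero bit received while no connection is open is treated as a connection consisting of a single transfer and is labeled "close".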

Supported transfer types in this embodiment include, but are not limited to,
read and write transfers. Address lines 228 carry a multi-bit signal driven by an 
initiator bus owner to specify the address of the object to be read or written during 
the current transfer. Response lines 232 carry a multi-bit signal driven by a target to 
indicate the status of the current transfer. Supported responses include, but are not
limited to, the following responses. A NULL response indicates that the current
transfer is to be aborted, presumably because the address does not select any target. 
A data valid and accepted (DVA) response indicates, in the case of a read, that the 
target is returning requested data on data lines 230. In the case of a write, a DVA 
response indicates that the target is accepting the provided data from data lines 230. 
A BUSY response indicates that the selected target has a resource conflict and 
cannot service the current request. In this case an initiator should reattempt the
transfer later. A RETRY response indicates that the selected target could not
deliver the requested read data in time, but promises to do so at a later time. In this
case an initiator must reattempt the transfer at a later time.
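
The four responses described above may be summarized, purely as a sketch, by the following Python fragment. The enumeration `Response`, the function `initiator_action`, and the action labels are hypothetical names introduced for illustration; actual encodings on response lines 232 are not specified here.

```python
from enum import Enum


class Response(Enum):
    NULL = "NULL"    # transfer aborted; the address selected no target
    DVA = "DVA"      # data valid and accepted
    BUSY = "BUSY"    # target has a resource conflict
    RETRY = "RETRY"  # read data not ready in time; target will deliver later


def initiator_action(response):
    """Return the initiator's next step for one transfer (hypothetical labels)."""
    if response is Response.DVA:
        return "retire"
    if response is Response.NULL:
        return "abort"
    # BUSY and RETRY both lead to a later reattempt of the same transfer.
    return "reattempt"
```

The sketch does not distinguish the advisory reattempt after BUSY from the required reattempt after RETRY; an implementation might track that difference separately.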

Connection identifier (CONNID) lines 226 carry a multi-bit signal driven by 
an initiator bus owner to indicate which connection the current transfer is part of. 
A connection is a logical state, established by an initiator, in which data may pass 
between the initiator and an associated target. The CONNID typically transmits
information including the identity of the functional block initiating the transfer and 
ordering information regarding an order in which the transfer must be processed. 
In one embodiment, the information conveyed by the CONNID includes 
information regarding the priority of the transfer with respect to other transfers. In 
one embodiment the CONNID is an eight-bit code. An initiator interface module
sends a unique CONNID along with an initial address transfer of a connection. 
Later transfers associated with this connection (for example, data transfers) also 
provide the CONNID value so both sender and receiver (as well as any device 
monitoring transfers on shared communications bus 1012) can unambiguously 
identify transfers associated with the connection. One advantage of using a 
CONNID is that transfers belonging to different transactions can be interleaved 
arbitrarily between multiple devices on a per cycle basis. In one embodiment,
shared communications bus 1012 implements a fully pipelined protocol that
requires strict control over transaction ordering in order to guarantee proper system
operation. Without the use of a CONNID, ordering constraints within a particular 
transaction may be violated because transfers associated with a particular 
connection are not identified. 
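
As a minimal sketch of why tagging each transfer with a CONNID permits arbitrary interleaving, the following Python fragment separates an interleaved sequence of transfers back into per-connection streams while preserving the order within each stream. The function name `demultiplex` and the representation of a transfer as a `(connid, payload)` pair are assumptions made for illustration only.

```python
from collections import defaultdict


def demultiplex(transfers):
    """Group interleaved (connid, payload) pairs by CONNID, keeping arrival order."""
    per_connection = defaultdict(list)
    for connid, payload in transfers:
        if not 0 <= connid <= 0xFF:  # eight-bit CONNID, as in one embodiment
            raise ValueError(f"CONNID out of range: {connid}")
        per_connection[connid].append(payload)
    return dict(per_connection)
```

For example, transfers of two connections interleaved cycle by cycle are recovered as two ordered lists, one per CONNID.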

Because a first command may be rejected by a BUSY response while a later 
command is already in flight, it is essential to provide mechanisms that allow full 
control over which commands complete. If such control is not present, ambiguous 
system behavior can result. For instance, if a single initiator interface module issues
a sequence of dependent read and write commands, a BUSY response to one of the
commands could result in later commands returning the wrong data. One solution
to such problems is to avoid overlapping dependent commands. This solution,
however, increases the latency of every dependent command in order to ensure
proper results. The present invention uses a CONNID signal, in part, to allow
overlapping of dependent commands. Therefore, use of a CONNID improves
system performance and efficiency. Another benefit of the CONNID of the present
invention is that communication system predictability is enhanced because it allows
a shared functional block to respond to requests based upon quality of service
guarantees that may vary between connections. For example, data requested to
operate a computer display cannot tolerate unpredictable delay because delay
causes the display to flicker. Therefore, the CONNID may be used to prioritize data
requests from a display controller so that requests from the display controller to a
common resource are serviced before other requests. The present invention also
allows for flexible reconfiguration of the CONNID to retune system performance.
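
One way a shared functional block might use the CONNID to service requests by priority is sketched below in Python. The function `next_request`, the priority table `priority_of`, and the convention that a lower number is more urgent are all hypothetical; the disclosure does not prescribe how priority is encoded or looked up.

```python
def next_request(pending_connids, priority_of, default_priority=255):
    """Pick the CONNID to service next; a lower number means more urgent."""
    # min() returns the first minimal element, so arrival order breaks ties.
    return min(pending_connids, key=lambda c: priority_of.get(c, default_priority))
```

In this sketch, retuning system performance amounts to supplying a different `priority_of` table; the selection logic itself is unchanged.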

Figure 6 is a timing diagram of a pipelined write transaction consisting of 
two write transfers on shared communications bus 1012. Reference may also be 
made to Figure 5. A single pipelined bus transfer, as shown in Figure 6, includes an 
arbitration cycle (not shown), followed by a command/address/CONNID (CMD 
324/ADDR 328/CONNID 326) cycle (referred to as a request, or REQ cycle), and
completed by a DATA 330/RESP 342 cycle (referred to as a response, or RESP
cycle). In one embodiment, the number of cycles between a REQ cycle and a RESP
cycle is chosen at system implementation time based upon the operating frequency
and module latencies to optimize system performance. The REQ-RESP latency, in
one embodiment, is two cycles and is labeled above the DATA 330 signal line on
Figure 6. Therefore, a complete transfer time includes four shared communications
bus 1012 cycles: arbitration, request, delay and response.
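
The fixed-latency timing just described can be checked with a short Python sketch. The function name `transfer_cycles` is hypothetical, and the sketch assumes that the arbitration cycle immediately precedes the REQ cycle.

```python
def transfer_cycles(req_cycle, req_resp_latency=2):
    """Return the bus cycle of each phase of one pipelined transfer."""
    return {
        "arbitration": req_cycle - 1,  # assumed to immediately precede REQ
        "request": req_cycle,
        "response": req_cycle + req_resp_latency,
    }
```

With the two-cycle REQ-RESP latency of one embodiment, a request issued on cycle 1 receives its response on cycle 3, and a request issued on cycle 3 receives its response on cycle 5.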

Two transfers are shown in Figure 6. On cycle 1, initiator E drives REQ fields
340 to request a WRITE transfer to address ADDRE0. This process is referred to as
issuing the transfer request. In one embodiment, a single target is selected to
receive the write data by decoding an external address portion of ADDRE0. On
cycle 3 (a REQ-RESP latency later), initiator E drives write data DATAE0 on the
DATA wires; simultaneously, the selected target A drives RESP wires 342 with the
DVA code, indicating that A accepts the write data. By the end of cycle 3, target A
has acquired the write data, and initiator E detects that target A was able to accept
the write data, and the transfer has thus completed successfully.

Meanwhile (i.e., in cycle 3), initiator E issues a pipelined WRITE transfer
(address ADDRE1) to target A. The write data and target response for this transfer
both occur on cycle 5, where the transfer completes successfully. Proper operation
of many systems and sub-systems relies on the proper ordering of related transfers.
Thus, proper system operation may require that the cycle 3 WRITE complete after
the cycle 1 WRITE transfer. In Figure 6, the CONNID field conveys crucial
information about the origin of the transfer that can be used to enforce proper
ordering. A preferred embodiment of ordering restrictions is that the initiator and
target collaborate to ensure proper ordering, even during pipelined transfers. This
is important, because transfer pipelining reduces the total latency of a set of
transfers (perhaps a single transaction), thus improving system performance (by
reducing latency and increasing usable bandwidth).

According to the algorithm of one embodiment:

1. An initiator may issue a transfer Y:

a) if transfer Y is the oldest non-issued, non-retired transfer
among the set of transfer requests it has with matching CONNID, or

b) if all of the older non-retired transfers with matching CONNID
are currently issued to the same target as transfer Y. If issued under this provision,
transfer Y is considered pipelined with the older non-retired transfers.

2. A target that responds to a transfer X in such a way that the initiator 
might not retire the transfer must respond BUSY to all later transfers with the same 
CONNID as transfer X that are pipelined with X. 
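
The initiator side of the rules above (step 1) can be sketched in Python as follows. This is an illustrative reading, not the disclosed implementation: the `Transfer` record and `may_issue` function are hypothetical names, and the sketch assumes that provision (a) corresponds to the case where no older non-retired transfer with the same CONNID exists, while provision (b) covers the case where every such older transfer is in flight to the same target.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    connid: int
    target: str
    issued: bool = False
    retired: bool = False


def may_issue(queue, index):
    """Decide whether queue[index] may issue; queue is ordered oldest first."""
    y = queue[index]
    if y.issued or y.retired:
        return False
    # Older transfers on the same connection that have not yet retired.
    older = [t for t in queue[:index]
             if t.connid == y.connid and not t.retired]
    # Provision (a): no such transfers exist (all() of an empty list is True).
    # Provision (b): all of them are currently issued to the same target as Y.
    return all(t.issued and t.target == y.target for t in older)


queue = [
    Transfer("W0", connid=1, target="A"),
    Transfer("W1", connid=1, target="A"),
    Transfer("R0", connid=2, target="B"),
    Transfer("W2", connid=1, target="B"),
]
```

In this example W1 becomes eligible only once W0 has issued, R0 is unconstrained because it belongs to a different connection, and W2 must wait until W0 and W1 retire because it addresses a different target.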

Note that an older transfer Y that is issued after a newer transfer X with
matching CONNID is not considered pipelined with X, even if Y issues before X
completes. This situation is illustrated in Figure 7. If target A has a resource
conflict that temporarily prevents it from accepting DATAE0 associated with the
WRITE ADDRE0 from cycle 1, then A responds BUSY. Step 2 of the foregoing
algorithm requires that A also reject (using BUSY) any other pipelined transfers
from the same CONNID (in this case, CONNID 1), since the initiator cannot
possibly know about the resource conflict until after the REQ-RESP latency has
passed. Thus, target A must BUSY the WRITE ADDRE1 that is issued in cycle 3,
because it has the same CONNID and was issued before the initiator could interpret
the BUSY response to the first write transfer, and is thus a pipelined transfer.
Furthermore, the second attempt (issued in cycle 4) of the WRITE ADDRE0 transfer
is allowed to complete because it is not a pipelined transfer, even though it overlaps
the cycle 3 WRITE ADDRE1 transfer.

Note that target A determines that the cycle 4 write is not pipelined with any
earlier transfers because of when it occurs and which CONNID it presents, and not
because of either the CMD or the ADDR values. Step 1 of the algorithm
guarantees that an initiator will only issue a transfer that is the oldest non-issued,
non-retired transfer within a given connection. Thus, once the first WRITE
ADDRE0 receives the BUSY response in cycle 3, it is no longer issued, and so it
becomes the only CONNID 1 transfer eligible for issue. It is therefore impossible
for a properly operating initiator to issue a pipelined transfer in cycle 4, given that
an initial cycle 1 transfer received a BUSY response and the REQ-RESP latency is
two cycles.

One embodiment of the initiator maintains a time-ordered queue consisting
of the desired transfers within a given CONNID. Each transfer is marked as
non-issued and non-retired as it is entered into the queue. It is further marked as
pipelined if the immediately older entry in the queue is non-retired and addresses
the same target; otherwise, the new transfer is marked non-pipelined. Each time a
transfer issues it is marked as issued. When a transfer completes (i.e., when the
RESP cycle is finished) the transfer is marked non-issued. If the transfer completes
successfully, it is marked as retired and may be deleted from the queue. If the
transfer does not complete successfully, it will typically be re-attempted, and thus
can go back into arbitration for re-issue. If the transfer does not complete
successfully, and it will not be re-attempted, then it should not be marked as retired
until the next transfer, if it exists, is not marked as issued. This restriction prevents
the initiator logic from issuing out of order. As the oldest non-retired transfer
issues, it is marked as issued. This allows the second-oldest non-retired transfer to
arbitrate to issue until the older transfer completes (and is thus marked as
non-issued), if it is marked as pipelined.
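
The bookkeeping described above can be sketched as a small per-CONNID queue in Python. The class and method names are hypothetical, and the sketch makes two simplifying assumptions: retired entries are deleted immediately, and an unsuccessful transfer is always re-attempted (the case of a failed transfer that is abandoned is not modeled).

```python
from dataclasses import dataclass


@dataclass(eq=False)            # compare entries by identity, not by value
class Entry:
    name: str
    target: str
    pipelined: bool
    issued: bool = False
    retired: bool = False


class InitiatorQueue:
    """Time-ordered queue of desired transfers for one CONNID."""

    def __init__(self):
        self.entries = []       # oldest first; retired entries are deleted

    def enqueue(self, name, target):
        prev = self.entries[-1] if self.entries else None
        # Pipelined if the immediately older entry is non-retired and
        # addresses the same target.
        pipelined = (prev is not None and not prev.retired
                     and prev.target == target)
        entry = Entry(name, target, pipelined)
        self.entries.append(entry)
        return entry

    def eligible(self):
        """Return the entry that may arbitrate for the bus, or None."""
        for i, entry in enumerate(self.entries):
            if entry.issued:
                continue
            # entry is the oldest non-issued, non-retired transfer.
            older_in_flight = any(o.issued for o in self.entries[:i])
            if not older_in_flight or entry.pipelined:
                return entry
            return None
        return None

    def issue(self, entry):
        entry.issued = True

    def complete(self, entry, success):
        entry.issued = False    # the RESP cycle has finished
        if success:
            entry.retired = True
            self.entries.remove(entry)
```

Replaying the Figure 7 sequence with this sketch, the BUSY response to the first write makes it the only eligible transfer again, and the second write becomes eligible as a pipelined transfer once the first has re-issued.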

An embodiment of the target implementation maintains a time-ordered
queue whose depth matches the REQ-RESP latency. The queue operates off of the
bus clock, and the oldest entry in the queue is retired on each bus cycle;
simultaneously, a new entry is added to the queue on each bus cycle. The CONNID
from the current REQ phase is copied into the new queue entry. In addition, if the
current REQ phase contains a valid transfer that selects the target (via the External
Address), then "first" and "busy" fields in the new queue entry may be set;
otherwise, the first and busy bits are cleared. The first bit will be set if the current
transfer will receive a BUSY response (due to a resource conflict) and no earlier
transfer in the queue has the same CONNID and has its first bit set. The first bit
implies that the current transfer is the first of a set of potentially-pipelined transfers
that will need to be BUSY'd to enforce ordering. The busy bit is set if either the
target has a resource conflict or one of the earlier transfers in the queue has the
same CONNID and has the first bit set. This logic enforces the REQ-RESP pipeline
latency, ensuring that the target accepts no pipelined transfers until the initiator can
react to the BUSY response to the transfer marked first.
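
The target-side queue can be sketched in Python as follows. The class name and the `clock` method are hypothetical, and the sketch assumes that the "earlier transfers in the queue" are the entries recorded during the previous REQ-RESP latency cycles, which a bounded deque models by dropping the oldest entry whenever a new one is appended.

```python
from collections import deque


class TargetOrderingQueue:
    """Time-ordered queue whose depth equals the REQ-RESP latency."""

    def __init__(self, req_resp_latency=2):
        # Each entry is (connid, first, busy); the oldest entry is
        # discarded automatically when a new one is appended.
        self.queue = deque(maxlen=req_resp_latency)

    def clock(self, connid=None, selected=False, conflict=False):
        """Advance one bus cycle; return True if the request gets BUSY."""
        first = busy = False
        if selected:
            # An earlier entry with the same CONNID has its first bit set.
            blocked = any(c == connid and f for c, f, _ in self.queue)
            first = conflict and not blocked
            busy = conflict or blocked
        self.queue.append((connid, first, busy))
        return busy


target_a = TargetOrderingQueue(req_resp_latency=2)
responses = [
    target_a.clock(connid=1, selected=True, conflict=True),   # cycle 1
    target_a.clock(connid=2, selected=True),                   # cycle 2
    target_a.clock(connid=1, selected=True),                   # cycle 3
    target_a.clock(connid=1, selected=True),                   # cycle 4
]
```

The four calls mirror the sequence of Figure 9: the cycle 1 write is rejected because of the resource conflict, the unrelated cycle 2 read is accepted, the pipelined cycle 3 write is rejected to preserve ordering, and the cycle 4 retry is accepted.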

Application of the algorithm to the initiators and targets in the 
communication system provides the ability to pipeline transfers (which increases 
per-connection bandwidth and reduces total transaction latency) while maintaining 
transaction ordering. The algorithm therefore facilitates high per-connection
performance. The fundamental interleaved structure of the pipelined bus allows
for high system performance, because multiple logical transactions may overlap one
another, thus allowing sustained system bandwidth that exceeds the peak
per-connection bandwidths. For instance, Figure 8 demonstrates a system
configuration in which initiator E needs to transfer data to target A on every other
bus cycle, while initiator D requests data from target B on every other bus cycle.
Since the communication system supports fine interleaving (per bus cycle), the 
transactions are composed of individual transfers that issue at the natural data rate 
of the functional blocks; this reduces buffering requirements in the functional 
blocks, and thus reduces system cost. The total system bandwidth in this example 
is twice the peak bandwidth of any of the functional blocks, and thus high system 
performance is realized. 

The present invention adds additional system-level improvements in the 
area of efficiency and predictability. First, the connection identifier allows the 
target to be selective in which requests it must reject to preserve in-order operation. 
The system only need guarantee ordering among transfers with the same CONNID, 
so the target must reject (using BUSY) only pipelined transfers. This means that the 
target may accept transfers presented with other CONNID values even while
rejecting a particular CONNID. This situation is presented in Figure 9, which adds
an interleaved read transfer from initiator D to the pipelined write transfer of
Figure 7. All four transfers in Figure 9 select target A, and A has a resource conflict
that prevents successful completion of the WRITE ADDRE0 that issues in cycle 1.
While the rejection of the first write prevents A from accepting any other transfers
from CONNID 1 until cycle 4, A may accept the unrelated READ ADDRD0 request
of cycle 2 if A has sufficient resources. Thus, overall system efficiency is increased,
since fewer bus cycles are wasted (as would be the case if target A could not 
distinguish between connections). 

Second, in one embodiment the connection identifier allows the target to
choose which requests it rejects. The target may associate meanings such as transfer
priority to the CONNID values, and therefore decide which requests to act upon
based upon a combination of the CONNID value and the internal state of the target.
For instance, a target might have separate queues for storing transfer requests of
different priorities. Referring to Figure 9, the target might have a queue for low
priority requests (which present with an odd CONNID) and a queue for high
priority requests (which present with an even CONNID). Thus, the CONNID 1
WRITE ADDRE0 request of cycle 1 would be rejected if the low-priority queue were
full, whereas the CONNID 2 READ ADDRD0 transfer could be completed
successfully based upon available high-priority queue resources. Such differences
in transfer priorities are very common in highly-integrated electronic systems, and
the ability for the target to deliver higher quality of service to higher priority
transfer requests adds significantly to the overall predictability of the system.
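
The odd/even priority example can be sketched in Python as follows. The class name, the queue depths and the use of the strings "DVA" and "BUSY" as return values are assumptions made for illustration only.

```python
from collections import deque


class PriorityTarget:
    """Target with separate request queues selected by CONNID parity."""

    def __init__(self, low_depth, high_depth):
        self.low = deque()      # odd CONNID: low priority requests
        self.high = deque()     # even CONNID: high priority requests
        self.low_depth = low_depth
        self.high_depth = high_depth

    def request(self, connid, command):
        if connid % 2 == 0:
            queue, depth = self.high, self.high_depth
        else:
            queue, depth = self.low, self.low_depth
        if len(queue) >= depth:
            return "BUSY"       # no room in the queue for this priority
        queue.append((connid, command))
        return "DVA"            # request accepted


target = PriorityTarget(low_depth=1, high_depth=2)
target.request(3, "WRITE")      # an earlier request fills the low queue
```

With the low-priority queue full, a CONNID 1 write is rejected while a CONNID 2 read is still accepted, because the decision depends on both the CONNID value and the internal state of the target.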

As Figure 9 implies, the algorithm described above allows a target to actively
satisfy transfer requests from multiple CONNID values at the same time. Thus,
there may be multiple logical transactions in flight to and/or from the same target,
provided that they have separate CONNID values. Thus, the present invention
supports multiple connections per target functional block.

Additionally, an initiator may require the ability to present multiple
transactions to the communications system at the same time. Such a capability is
very useful for initiators such as direct memory access (DMA) devices, which
transfer data between two targets. In such an application, the DMA initiator would
present a read transaction using a first CONNID to a first target that is the source of
the data, and furthermore present a write transaction using a second CONNID to a
second target that is the data destination. At the transfer level, the read and write
transfers could be interleaved. This reduces the amount of data storage in the DMA
initiator, thus reducing system cost. Such an arrangement is shown in Figure 10,
where initiator E interleaves pipelined read transfers from target A with pipelined
write transfers to target B. Thus, the present invention supports multiple
connections per initiator functional block.

The control structures required to support implementation of the present
invention, as described above with respect to the algorithm, are simple and require
much less area than the data buffering area associated with traditional protocols 
that do not provide efficient fine interleaving of transfers. Thus, the present 
invention minimizes communication system area and complexity, while delivering 
high performance and flexibility. 

Finally, the CONNID values that are associated with particular initiator 
transactions should typically be chosen to provide useful information such as 
transfer priorities but also to minimize implementation cost. It is useful to choose 
the specific CONNID values at system design time, so the values can be guaranteed 
to be unique and can be ordered to simplify comparison and other operations. 
Furthermore, it is frequently useful to be able to change the CONNID values during 
operation of the communications system so as to alter the performance and 
predictability aspects of the system. Preferred implementations of the present 
invention enable flexible system configuration by storing the CONNID values in 
ROM or RAM resources of the functional blocks, so they may be readily re- 
configured at either system build time or system run time. 

Figure 11 shows an interconnect 1010, which is a point-to-point interconnect
as shown in Figure 4. Interconnect 1010 includes additional signals as compared to
the protocol described with reference to Figure 5. As will be explained below, some
of the additional signals are particularly useful as signals sent over point-to-point
interconnects such as interconnects 1010. The protocol of interconnect 1010 controls
point-to-point transfers between a master entity 1102 and a slave entity 1104 over a
dedicated (non-shared) interconnect. Referring to Figure 5, a master entity may be,
for example, initiator functional block 1002 or target interface module 1006. A slave
entity may be, for example, initiator interface module 1004 or target functional
block 1008.

Signals shown in Figure 11 are labeled with signal names. In addition, some
signal names are followed by a notation or notations in parentheses or brackets.
The notations are as follows:

(F) The signal is optional and is independently configurable
(A) The signal must be configured together with signals having similar
notations
(AI) The signal is independently configurable if (A) interface modules exist
[#] Maximum signal width

The clock signal is the clock of a connected functional block. The command
(Cmd) signal indicates the type of transfer on the bus. Commands can be issued
independent of data. The address (Addr) signal is typically an indication of a
particular resource that an initiator functional block wishes to access. Request
Accept (ReqAccept) is a handshake signal whereby slave 1104 allows master 1102 to
release Cmd, Addr and DataOut from one transfer and reuse them for another
transfer. If slave 1104 is busy and cannot participate in a requested transfer, master
1102 must continue to present Cmd, Addr and DataOut. DataOut is data sent from
a master to a slave, typically in a write transfer. DataIn typically carries read data.

Response (Resp) and DataIn are signals sent from slave 1104 to master 1102.
Resp indicates that a transfer request that was received by slave 1104 has been
serviced. Response accept (RespAccept) is a handshake signal used to indicate
whether the master allows the slave to release Resp and DataIn.

Signals Clock, Cmd, Addr, DataOut, ReqAccept, Resp, DataIn and
RespAccept, in one embodiment, make up a basic set of interface module signals.
For some functional blocks, the basic set may be adequate for communication
purposes.

In other embodiments, some or all of the remaining signals of bus 1012 may
be used. In one embodiment, Width is a three-bit signal that indicates a width of a
transfer and is useful in a connection that includes transfers of variable width.
Burst is a multibit signal that allows individual commands to be associated within a
connection. Burst provides an indication of the nature of future transfers, such as
how many there will be and any address patterns to be expected. Burst has a 
standard end marker. Some bits of the Burst field are reserved for user-defined 
fields, so that a connection may be ignorant of some specific protocol details within 
a connection. 

Interrupt and error signals are an important part of most computer systems. 
Interrupt and error signals generated by initiator or target functional blocks are 
shown, but the description of their functionality is dependent upon the nature of a 
particular functional block and is not important to understanding the invention. 

Request Thread Identifier (ReqThreadID), in one embodiment, is a four-bit 
signal that provides the thread number associated with a current transaction 
intended for slave 1104. All commands executed with a particular thread ID must
execute in order with respect to one another, but they may execute out of order with 
respect to commands from other threads. Response Thread Identifier 
(RespThreadID) provides a thread number associated with a current response. 
Because responses in a thread may return out of order with respect to other threads, 
RespThreadID is necessary to identify which thread's command is being responded 
to. In one embodiment, ReqThreadID and RespThreadID are optional signals, but if 
one is used, both must be used. 

Request Thread Busy (ReqThreadBusy) allows the slave to indicate to the 
master that it cannot take any new requests associated with certain threads. In one 
embodiment, the ReqThreadBusy signal is a vector having one signal per thread,
and a signal asserted indicates that the associated thread is busy. 

Response Thread Busy (RespThreadBusy) allows the master to indicate to 
the slave that it cannot take any responses (e.g., on reads) associated with certain 
threads. The RespThreadBusy signal is a vector having one signal per thread, and a 
signal asserted indicates that the associated thread is busy. 
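
A thread busy vector of this kind can be modeled in Python as an integer bit mask. The function names are hypothetical, and the assignment of bit i to thread i, as well as the default of 16 threads (matching a four-bit thread ID), are assumptions made for illustration.

```python
def thread_busy_vector(busy_threads, num_threads=16):
    """Build a busy vector with one bit per thread (bit i = thread i)."""
    vector = 0
    for thread_id in busy_threads:
        if not 0 <= thread_id < num_threads:
            raise ValueError(f"thread {thread_id} out of range")
        vector |= 1 << thread_id
    return vector


def may_send(vector, thread_id):
    """True if the given thread is not marked busy in the vector."""
    return not (vector >> thread_id) & 1


req_thread_busy = thread_busy_vector({0, 3})
```

A master would hold back new requests on threads 0 and 3 in this example while remaining free to issue requests on the other threads.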

Request Connection Identifier (ReqConnID) provides the CONNID
associated with the current transaction intended for the target. CONNIDs provide
a mechanism by which a system entity may associate particular transactions with
the system entity. One use of the CONNID is in establishing request priority
among various initiators. Another use is in associating actions or data transfers
with initiator identity rather than the address presented with the transaction
request.

The embodiment of Figure 11 provides end-to-end connection identification 
with CONNID as well as point-to-point, or more local identification with Thread 
ID. A Thread ID is an identifier of local scope that simply identifies transfers 
between the interface module and its connected functional block. In contrast, the
CONNID is an identifier of global scope that identifies transfers between two
interface modules (and, if required, their connected functional blocks). 

A Thread ID should be small enough to directly index tables within the
connected interface module and functional block. In contrast, there are usually
more CONNIDs in a system than any one interface module is prepared to
simultaneously accept. Using a CONNID in place of a Thread ID requires
expensive matching logic in the interface module to associate a returned CONNID
with specific requests or buffer entries.

Using a networking analogy, the Thread ID is a level-2 (data link layer) 
concept, whereas the CONNID is more like a level-3 (transport/session layer) 
concept. Some functional blocks only operate at level-2, so it is undesirable to 
burden the functional block or its interface module with the expense of dealing with
level-3 resources. Alternatively, some functional blocks need the features of level-3
connections, so in this case it is practical to pass the CONNID through to the 
functional block. 

Referring to Figure 4, a CONNID is required to be unique when transferred 
between interface modules 1004 and 1006 on shared communications bus 1012. The 
CONNID may be sent over a local interconnect, such as interconnect 1010. In many
cases, however, it is much more efficient to use only Thread ID between a
functional block and its interface module. For example, initiator functional block
1002 may not require all the information provided by the CONNID. Also, in some
systems, multiple identical initiator functional blocks 1002 may exist with the same
CONNID so that a particular target functional block 1008 receiving a transfer will
not know which connection it is actually part of unless logic in initiator interface
module 1004 translates the "local" CONNID to a unique "global" CONNID. The
design and implementation of such a translation functionality in an interface
module is complicated and expensive. In such cases, the CONNID may be sent
between interface modules over shared communications bus 1012 while the Thread
ID is sent between a functional block and an interface module.

In the case of an initiator functional block, a one-to-one static correspondence
may exist between Thread ID and CONNID. For example, if the Thread ID is "1", a
single CONNID is mapped for a particular interface module, solving the problem of
multiple, identical functional blocks.

In the case of a target functional block, there is a one-to-one dynamic
correspondence between a Thread ID and a CONNID. If a target functional block
supports two simultaneous threads, the target interface module acquires the
CONNID of an open connection and associates it with a thread as needed. For
example, a target interface module receives a CONNID of "7", and then maps
CONNID 7 to thread "0". Thereafter, all transfers with CONNID 7 are associated
with thread 0 until connection 7 is closed.
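
The dynamic correspondence on the target side can be sketched in Python as a small allocation table. The class and method names are hypothetical, and the policy of handing out the lowest free thread number, and of returning None when no thread is free, is an assumption of this sketch rather than something the description specifies.

```python
class TargetThreadMap:
    """Dynamic one-to-one mapping of open CONNIDs to local thread IDs."""

    def __init__(self, num_threads=2):
        self.free = list(range(num_threads))   # unused thread IDs, ascending
        self.thread_of = {}                    # CONNID -> thread ID

    def thread_for(self, connid):
        """Return the thread for a CONNID, allocating one if needed."""
        if connid not in self.thread_of:
            if not self.free:
                return None                    # no thread currently available
            self.thread_of[connid] = self.free.pop(0)
        return self.thread_of[connid]

    def close(self, connid):
        """Release the thread when the connection is closed."""
        self.free.append(self.thread_of.pop(connid))
        self.free.sort()


mapping = TargetThreadMap(num_threads=2)
```

Following the example in the text, the first transfer with CONNID 7 is assigned thread 0, later transfers with CONNID 7 reuse thread 0, and the thread becomes available to another connection once connection 7 is closed.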

Referring to Figure 12, as an example of a use of Thread ID, consider a series of
identical direct memory access (DMA) engines in a system. In Figure 12, elements
1202 are identical DMA engines, each connected to an initiator interface module
1204. Initiator interface modules 1204 are connected to shared communications bus
1212. Target interface module 1206 is also connected to shared communications bus
1212 and transmits data from bus 1212 to DRAM controller 1208, which is a target
functional block. Target interface module 1206 is connected to DRAM controller
1208 by interconnect 1214. DRAM controller 1208 controls access to DRAM 1213.

A DMA engine is an example of an initiator functional block that also
functions as a target functional block. When the DMA engine is programmed by
software, it acts as a target. Thereafter, the DMA engine is an initiator. Because a
DMA engine performs both read and write operations, two connections can be
associated with a single DMA engine. If some buffering is available in the DMA
engine, read and write operations may be decoupled so that both types of
operations can be performed concurrently. A read may occur from a long latency
storage device which requires the read data to be buffered on the DMA engine
before a write operation writes the data. In one embodiment, each of DMA engines
1202 uses a Thread ID to identify the read stream and a different Thread ID to
identify the write stream. The DMA engine does not require more information,
such as what other functional block participates in a transaction. Therefore, a
CONNID is not required to be sent from the DMA engine 1202 to a connected
interface module 1204. Mapping of a Thread ID to a CONNID occurs in the
interface module 1204.

In one embodiment, each initiator interface module 1204 maps a unique
CONNID to each of two threads from a connected DMA engine 1202. Each of
DMA engines 1202 uses a single bit, for example, Thread ID of Figure 11, to
distinguish between its two threads. For transfer over shared communications
bus 1212, a unique CONNID is sent to target interface module 1206. The CONNID may
include priority information, for example, assigning high priority to requests for
graphics data. The high priority graphics data request is immediately serviced by
DRAM controller 1208 while lower priority requests may be required to wait.
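
The static mapping performed by the initiator interface modules can be sketched in Python as a per-module lookup table. The class name, the dictionary used as a bus request, and the specific CONNID values are all hypothetical and chosen only to illustrate the idea.

```python
READ_THREAD, WRITE_THREAD = 0, 1   # single-bit Thread ID used by each engine


class InitiatorInterfaceModule:
    """Translate a local Thread ID into a globally unique CONNID."""

    def __init__(self, thread_to_connid):
        self.thread_to_connid = dict(thread_to_connid)

    def to_bus(self, thread_id, command, address):
        """Build the request presented on the shared bus."""
        return {
            "connid": self.thread_to_connid[thread_id],
            "cmd": command,
            "addr": address,
        }


# Two identical DMA engines use the same Thread IDs; only the tables in
# their interface modules differ (CONNID values are illustrative).
module_0 = InitiatorInterfaceModule({READ_THREAD: 2, WRITE_THREAD: 3})
module_1 = InitiatorInterfaceModule({READ_THREAD: 4, WRITE_THREAD: 5})

requests = [
    module_0.to_bus(READ_THREAD, "READ", 0x1000),
    module_0.to_bus(WRITE_THREAD, "WRITE", 0x2000),
    module_1.to_bus(READ_THREAD, "READ", 0x1000),
    module_1.to_bus(WRITE_THREAD, "WRITE", 0x2000),
]
```

Because the translation lives entirely in the interface module, the identical engines need no knowledge of the CONNID values, and the four streams remain distinguishable on the shared bus.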

Because intelligence is designed into the interface modules and the
communication protocols, less intelligence is required of the functional blocks such
as the DRAM controller 1208 and the DMA engines 1202. This has the advantage of
making functional blocks more portable or reusable. For
example, a DMA engine used for a high priority application may be switched with
a DMA engine used for a lower priority application simply by changing their
respective connected interface modules.

In one embodiment, target and initiator interface modules are programmed 
at the transistor level so that their precise function, including their CONNID 
assignment, is fixed at power-up. In another embodiment, the design of interface 
modules is in RAM so that the interface module is a reprogrammable resource. In 
this case, the interface module is reprogrammed, including reassignment of 
CONNIDs, by software. 

The present invention has been described in terms of specific embodiments. 
For example, embodiments of the present invention have been shown as systems of 
particular configurations, including protocols having particular signals. One of 
ordinary skill in the art will recognize that different system configurations and 
different signals may be used without departing from the spirit and scope of the 
invention as set forth in the claims.

IN THE CLAIMS
What is claimed is:

1. A communication system comprising:

at least two functional blocks, wherein a first functional block
communicates with a second functional block by establishing a connection, wherein
a connection is a logical state in which data may pass between the first functional
block and the second functional block; and

a bus coupled to each of the functional blocks and configured to carry a 
plurality of signals, wherein the plurality of signals comprises a connection 
identifier that indicates a particular connection that a data transfer is part of. 

2. The communication system of claim 1, wherein the plurality of signals
further comprises a thread identifier that indicates a transaction stream that the 
data transfer is part of. 

3. The communication system of claim 2, further comprising: 
an initiator functional block that sends transfer requests; 

an initiator interface module coupled to the initiator functional block and to 
the bus; 

a target functional block that responds to transfer requests; and 

a target interface module coupled to the target functional block, wherein the
connection identifier is sent with a transfer request from the initiator interface
module to the target interface module.

4. The communication system of claim 3, wherein the thread identifier is
sent from the target interface module to the target functional block and from the
initiator interface module to the initiator functional block.

5. The communication system of claim 4, wherein the connection
identifier is sent from the target interface module to the target functional block and
from the initiator interface module to the initiator functional block.

6. The communication system of claim 5, wherein the connection
identifier is a multi-bit value that encodes information including:

a transfer priority;
a transfer order; and

a functional block that originated the transfer.

7. The communication system of claim 6, wherein the connection 
identifier is one of a plurality of connection identifiers associated with an initiator 
functional block, and wherein the connection identifier is mapped to a thread 
identifier by the initiator interface module. 

8. The communication system of claim 7, wherein the connection
identifier is one of a plurality of connection identifiers associated with a target
functional block that supports simultaneous connections, and wherein the target
functional block acquires a connection identifier of an open connection and maps
the connection identifier to a thread identifier.

9. The communication system of claim 2, wherein the plurality of signals
further comprises:

a request thread ID signal that indicates a thread number associated with a
current transaction intended for a target functional block;

a response thread ID signal that indicates a thread that a transfer from the
target functional block is part of;

a request thread busy signal that indicates to an initiator
functional block that the target functional block cannot receive new requests
associated with certain threads; and

a response thread busy signal that indicates that the initiator functional block
cannot receive any new responses from the target functional block that are
associated with certain threads.

10. A method for communicating between functional blocks in a 
computer system, the method comprising the steps of: 

establishing a plurality of connection identifiers, wherein each connection
identifier associates a particular data transfer with a particular connection, wherein
a connection is a logical state in which data may pass between an initiator
functional block of the plurality of functional blocks and a target functional block of
the plurality of functional blocks, and wherein a connection is established when a
particular data transfer is initiated; and

allowing an initiator functional block to issue a first transfer "Y" if the 
transfer "Y" is an oldest non-issued, non-retired transfer among a set of transfer 
requests with a same connection identifier as the transfer "Y". 

11. The method of claim 10, further comprising the step of allowing the
initiator functional block to issue the transfer "Y" if every non-retired transfer with
the same connection identifier is older than the transfer "Y" and is currently issued
to a same target functional block as the transfer "Y".

12. The method of claim 11, wherein if the transfer "Y" is issued, the
transfer "Y" is considered pipelined with the older, non-retired transfers.

13. The method of claim 10, further comprising the step of the target
functional block giving a BUSY response to every later transfer that is pipelined
with the transfer "X" and has a same connection identifier as the transfer "X" if the
target functional block responds to the transfer "X" so that an initiator initiating the
transfer "X" may not retire the transfer "X".

14. The method of claim 13, wherein a transfer "Y" that is issued after a
transfer "X", is older than the transfer "X", and has a same connection identifier as
the transfer "X" is considered not pipelined with the transaction "X".

15. The method of claim 12, wherein a target functional block determines
whether a transfer is a pipelined transfer based upon when the transfer occurs and 
upon a connection identifier associated with the transfer. 

16. The method of claim 14, ftirthter comprising the steps of: 

an initiator functional block maintaining a time-ordered queue of desired 
transfers with a same connection identified; 

the initiating functional block inking a transfer as non-issued and non- 
retired as it is entered into the queue. 

17. TTie method of claim 16, further comprising the steps of: 

if a next oldest entry is nbn-retired and addresses a same target functional 
block, marking the transfer as pipeiih€d; ^lste: : ^^^^^" • v 

marking the if kiisifCT las i^n-pipe 1 ' 

18. The method of claim 17, further comprising the step of, when a 

...... , ^. ' . - . . 

transfer issues, marking the transfer as issued! 

19. The method of claim 18, fur ther comprising the step of, when a 
transfer is completed, marking the transfer as non-issued. 
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v 20. The method of claiir* 19,£urther comprising the step of, if the transfer 
is successfully completed, marking the transfer as retired; and deleting, the transfer 
from the queue. 

21. The petjt^^ the step of, if the transfer 
is not successfully cpi^pleted. reattemp ting the transfer. 

22. The method of claim 14, further comprising the step of the target 
functional block maintaining a time-ordered queue having a depth that is a number 
of bus clock cycles between a request for a transfer and a response to the request. 

23. The method of claim 22, further comprising the steps of: 

on each cycle of the bus clock, retiring an oldest entry in the time-ordered 
queue; and 

on each cycle of the bus clock, adding a new entry to the time-ordered queue, 
including a connection identifier associated with a current request for a transfer. 

24. The method of claim 23, further comprising the steps of: 

if a current request for a transfer contains a valid transfer that selects the 
target functional block, allowing a FIRST bit and a BUSY bit of an entry in the 
time-ordered queue to be set, wherein a set FIRST bit implies that an associated transfer 
is a first transfer of a set of potentially pipelined transfers; else 

clearing the FIRST bit and the BUSY bit. 

25. The method of claim 24, further comprising the step of setting the 
FIRST bit if: 

no transfer in the time-ordered queue is earlier than a current transfer, has a 
same connection identifier as the current transfer and has an associated FIRST bit 
set; and 
the current transfer will receive a BUSY response due to a resource conflict. 

26. The method of claim 24, further comprising the step of setting the 
BUSY bit if: 

the target functional block has a resource conflict; or 
an earlier transfer in the time-ordered queue has an associated FIRST bit set 
and has a same connection identifier as a current transfer. 
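
The target-side rules of claims 22 through 26 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the claimed implementation: the names Entry, TargetQueue and clock are hypothetical, the queue is assumed to start filled with empty entries, and a cycle with no valid request is modeled by passing selects_target=False.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class Entry:
    conn_id: Optional[int] = None  # connection identifier of the request, if any
    first: bool = False            # FIRST bit
    busy: bool = False             # BUSY bit


class TargetQueue:
    """Time-ordered queue kept by a target functional block (claim 22)."""

    def __init__(self, depth: int) -> None:
        # depth: bus clock cycles between a request and its response
        self.entries: Deque[Entry] = deque(Entry() for _ in range(depth))

    def clock(self, conn_id: Optional[int], selects_target: bool,
              resource_conflict: bool) -> Entry:
        """Advance one bus clock cycle and return the entry that was added."""
        self.entries.popleft()          # claim 23: retire the oldest entry
        entry = Entry(conn_id=conn_id)  # claim 23: add a new entry
        if selects_target:              # claim 24: the bits may be set
            earlier_first = any(
                e.first and e.conn_id == conn_id for e in self.entries
            )
            # claim 25: no earlier FIRST on this connection, and a conflict
            entry.first = (not earlier_first) and resource_conflict
            # claim 26: a conflict, or an earlier FIRST on this connection
            entry.busy = resource_conflict or earlier_first
        # claim 24, else branch: both bits stay cleared (the defaults)
        self.entries.append(entry)
        return entry
```

With a depth of 2, a conflicting request on one connection sets both bits, a request on the same connection in the next cycle is marked BUSY even without a conflict of its own, and a request on a different connection is not.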

27. The method of claim 26, further comprising the step of using a 
connection identifier to enforce ordering among transfers. 

28. The method of claim 27, further comprising the step of: 

in response to a first request for a data transfer issued in a first bus cycle, the 
target functional block setting a BUSY bit in a first time-ordered queue entry, 
wherein a first connection identifier is associated with the first request; and 

in response to a second request for a data transfer in a next bus cycle 
subsequent to the first bus cycle, the target functional block clearing a BUSY bit in a 
second time-ordered queue entry and performing an action in connection with 
executing the data transfer requested in the second request. 